ABSTRACT



Least-squares estimates of regression coefficients 
are extremely sensitive to large errors in even a 
single data point. Frequently, an ad-hoc procedure is 
used to weight the data in a manner to alleviate the 
effects of extreme observations. 

This thesis is a study of the effectiveness of an 
iterative regression method using weights derived 
through maximum-likelihood arguments. Actual weights 
are calculated on the assumption of Cauchy-distributed 
error as a worst-case situation in which the errors 
have long, fat tails and no finite moments. 
I. INTRODUCTION



A. LEAST-SQUARES LINEAR REGRESSION 



It is often desirable to model the behavior of a 
response variable as a function of another variable, 
sometimes referred to as a "carrier", since it carries 
information about the dependent variable. In the simplest 
case, the equation 



$$y_i = b_0 + b_1 x_i$$

is fitted to a set of data points $(x_i, y_i)$. Usually this is
done using the "least-squares" procedure, which selects the
coefficients $\hat{b}_0$ and $\hat{b}_1$ that minimize the sum of
squared residuals, $r_i$, defined as

$$r_i = y_i - \hat{b}_0 - \hat{b}_1 x_i$$



The procedure is based on the linear model 

$$y_i = b_0 + b_1 x_i + e_i$$

where the $e_i$ are independent and identically distributed
random variables with mean zero and constant (but unknown)
variance. Then, by the Gauss-Markov Theorem, the estimates
$\hat{b}_0$ and $\hat{b}_1$ found by solving the "normal equations"

$$\sum y_i = n \hat{b}_0 + \hat{b}_1 \sum x_i$$

$$\sum x_i y_i = \hat{b}_0 \sum x_i + \hat{b}_1 \sum x_i^2$$

are the best (minimum variance) linear unbiased estimates of
$b_0$ and $b_1$.
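
The two normal equations can be solved in closed form. The sketch
below is an illustration only; the function name and the sample data
are assumptions, not part of the thesis. It also shows how a single
gross error in one observation moves both coefficients.

```python
def least_squares_line(xs, ys):
    """Solve the two normal equations for (b0, b1) in y = b0 + b1*x."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Normal equations:
    #   sy  = n*b0  + b1*sx
    #   sxy = b0*sx + b1*sxx
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b1 = (n * sxy - sx * sy) / det
    b0 = (sy - b1 * sx) / n
    return b0, b1


# Points lying exactly on y = 1 + 2x are recovered exactly.
b0, b1 = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)  # prints: 1.0 2.0

# Replacing the last response by a gross error (7 -> 70) drags the
# fitted line far away from the other three points.
bad_b0, bad_b1 = least_squares_line([0, 1, 2, 3], [1, 3, 5, 70])
print(round(bad_b0, 1), round(bad_b1, 1))  # prints: -11.6 20.9
```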



B. DEFICIENCIES OF LEAST-SQUARES

The least-squares procedure works very well when the $e_i$
are short-tailed and the other assumptions about the error
distribution hold. If, however, the error distribution has
very long tails, implying that extreme observations may well
occur, least-squares quickly demonstrates its sensitivity to
large random error. In real data, the analyst very rarely
has a hint as to the nature of the true distribution of the
$e_i$. Heuristic arguments appealing to the central limit
theorem are frequently made along the line that there are
several sources of variability in the data, which, in the
aggregate, will be "normally" distributed and thus suitable
for least-squares. Unfortunately, if any of the errors are
long-tailed (such as may be described by the Cauchy
distribution), then their aggregate effect will not be
normal.

Figure 10 and Figure 11 in the Appendix are histograms
of least-squares estimates for $b_0$ and $b_1$ in the linear model

$$y_i = 22 + 2(x_i - \bar{x}).$$

The $e_i$ for these estimates came from a Normal, or Gaussian,
distribution, for which the least-squares procedure is
optimal. Figure 12 and Figure 13 show the effects of
Cauchy-distributed error on the estimates of these 
coefficients. The end cells contain points which would 
otherwise be off the scale of the histogram, and emphasize 
that large errors in estimating the coefficients are quite 
possible when using least-squares. A uniform distribution's 
adverse effect upon the coefficient estimates is shown in 
Figure 14 and Figure 15. 

A function of Cauchy variates,

$$V = C e^{C/100}$$

is the error density associated with the widely-varying
coefficient estimates histogrammed in Figure 16 and Figure
17. This distribution is virtually symmetric between ±12,
but has a long tail extending toward $+\infty$. Another
distribution of error, a function of normal variates,

$$Z = N e^{N + 0.01 N^2}$$

has high positive skewness, a little bias, and an adverse
effect upon the least-squares estimates for $b_0$ and $b_1$ as
shown in Figure 18 and Figure 19.

All of these cases demonstrate that the variances of the
coefficient estimates may be drastically increased by the
presence of non-Gaussian, and especially long-tailed,
distributions of error. While the bulk of the estimates do
indeed fall near the actual values, there is clearly an
unacceptable probability of obtaining an extreme estimate
when using the least-squares procedure.



C. USE OF THE CAUCHY DISTRIBUTION 



Data disturbed by Cauchy-distributed error, with long, 
thick tails and lack of finite moments, may be considered an 
extremely difficult case for regression techniques to treat 
reliably. A procedure that works well for data subjected to 
such extremely straggly-tailed errors can reasonably be 
expected to work well, though not necessarily optimally, in 
many curve-fitting situations. This thesis uses
maximum-likelihood estimates for regression coefficients to
develop a robust regression procedure, then further assumes 
a Cauchy-distributed error to apply a specific technique to 
a series of controlled regression problems. 



II. SINGLE-CARRIER ROBUST REGRESSION



A. MAXIMUM-LIKELIHOOD ESTIMATORS



The procedure to be presented is based upon the linear 
model 

$$y_i = b_0 + b_1 (x_i - \bar{x}) + e_i$$

The $e_i$ are assumed to be independent, identically
distributed random variables centered at zero with spread $s$
(a scale parameter) and having a density of the form

$$\frac{1}{s} f\left(\frac{e}{s}\right)$$

The probability for any single observation $y_i$ may be
expressed as

$$P(Y_i = y_i) = f\left(\frac{y_i - b_0 - b_1 (x_i - \bar{x})}{s}\right) \frac{1}{s} \, dy$$

The likelihood function for n observations is the product of
n of the above probabilities. Taking logarithms, the
log-likelihood function is then

$$L = \sum_{i=1}^{n} \ln f\left(\frac{y_i - b_0 - b_1 (x_i - \bar{x})}{s}\right) - n \ln s$$

Partial derivatives are taken with respect to $b_0$, $b_1$ and
$s$, and all set equal to zero to find the $\hat{b}_0$, $\hat{b}_1$
and $\hat{s}$ which maximize L. Using $r_i$ as above, and with
$\psi(x)$ defined as

$$\psi(x) = \frac{f'(x)}{f(x)} = \frac{d}{dx} \ln f(x),$$

the three equations obtained from setting the partial
derivatives of L to zero may be written as

$$\sum_{i} \psi\left(\frac{y_i - b_0 - b_1 (x_i - \bar{x})}{s}\right) = 0$$

$$\sum_{i} \psi\left(\frac{y_i - b_0 - b_1 (x_i - \bar{x})}{s}\right) (x_i - \bar{x}) = 0$$

$$\sum_{i} \psi\left(\frac{y_i - b_0 - b_1 (x_i - \bar{x})}{s}\right) \left(\frac{y_i - b_0 - b_1 (x_i - \bar{x})}{s}\right) + n = 0$$
This system of non-linear equations may be solved
iteratively. Defining $w_i^{(j)}$ as

$$w_i^{(j)} = -\frac{\psi\left(r_i^{(j-1)} / s^{(j-1)}\right)}{r_i^{(j-1)} / s^{(j-1)}}$$

where the superscript (j) refers to the number of iterations,
the equations at the j-th iteration are

$$\sum_{i} w_i^{(j)} \left(y_i - \hat{b}_0 - \hat{b}_1 (x_i - \bar{x})\right) = 0$$

$$\sum_{i} w_i^{(j)} (x_i - \bar{x}) \left(y_i - \hat{b}_0 - \hat{b}_1 (x_i - \bar{x})\right) = 0$$

$$\left[s^{(j)}\right]^2 = \frac{1}{n} \sum_{i} w_i^{(j)} \left[r_i^{(j-1)}\right]^2$$

The first two equations are simply weighted least-squares
normal equations which may be solved by standard iterative
weighted least-squares algorithms in which the weights for
each subsequent iteration are calculated from the above
expression for $w_i^{(j)}$.

Assuming the error to be Cauchy-distributed, the weighting
formula becomes

$$w_i^{(j)} = \frac{2}{1 + \left(r_i^{(j-1)} / s^{(j-1)}\right)^2}$$
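
The Cauchy weight can be checked by a short derivation. The steps
below are a sketch under two assumptions: the standardized Cauchy
density is used for f, and the sign convention is $\psi = f'/f$ with
the weight taken as $-\psi(u)/u$, where $u = r_i/s$ is the scaled
residual.

```latex
f(u) = \frac{1}{\pi (1 + u^2)}, \qquad u = \frac{r_i}{s}

\ln f(u) = -\ln \pi - \ln (1 + u^2)

\psi(u) = \frac{d}{du} \ln f(u) = -\frac{2u}{1 + u^2}

w = -\frac{\psi(u)}{u} = \frac{2}{1 + u^2}
```

A point with zero residual therefore receives weight 2, a point whose
residual equals the spread receives weight 1, and the weight falls
toward zero as the residual grows.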
B. INITIAL ESTIMATES



It is necessary to begin the iterative process with an
initial estimate, or guess, of the values of the
coefficients. A robust estimate suggested by D. F. Andrews
[1] using the median provides an estimate which is
insensitive to arbitrarily large disturbances in up to 25%
of the data.

The first coefficient, $b_0$ (which corresponds to the mean
of the $y_i$ in least-squares estimation), is estimated by the
median of the $y_i$:

$$\hat{b}_0 = \operatorname{median}(y_i)$$

Next, the carriers, $(x_i - \bar{x})$, are ordered and then broken
up into three groups of approximately equal size. Of interest
are the upper group of carriers, $x_u$, the lower carriers, $x_l$,
and the $y_i$ corresponding to the $(x_i - \bar{x})$ in each group
($y_u$ and $y_l$).

The estimate for $b_1$ is a rough slope computed from the
medians of the four groups:

$$\hat{b}_1 = \frac{\operatorname{median}(y_u) - \operatorname{median}(y_l)}{\operatorname{median}(x_u) - \operatorname{median}(x_l)}$$

The median of the absolute values of the residuals from
these estimates is the initial guess for $s$:

$$s_0 = \operatorname{median} \left| y_i - \hat{b}_0 - \hat{b}_1 (x_i - \bar{x}) \right|$$

Weights $w_i^{(1)}$ are calculated from the residuals and $s_0$. The
algorithm then proceeds until the values of the coefficients
stabilize.
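
The median-based starting values described above might be computed as
in the following sketch. The function name, the use of n // 3 points
for each outer group, and the sample data are assumptions made for
illustration; the text says only that the three groups are of
approximately equal size.

```python
from statistics import median


def initial_estimates(xs, ys):
    """Median-based starting values (b0, b1, s0) for the centred model
    y = b0 + b1*(x - xbar) + e."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three points")
    xbar = sum(xs) / n
    cx = [x - xbar for x in xs]          # centred carriers
    b0 = median(ys)

    # Order the points by carrier value and take the outer thirds.
    order = sorted(range(n), key=lambda i: cx[i])
    k = n // 3                           # assumed group size
    lower, upper = order[:k], order[-k:]

    dx = median(cx[i] for i in upper) - median(cx[i] for i in lower)
    if dx == 0:
        raise ValueError("upper and lower carrier medians coincide")
    b1 = (median(ys[i] for i in upper) - median(ys[i] for i in lower)) / dx

    # Initial spread: median absolute residual from the rough fit.
    s0 = median(abs(y - b0 - b1 * c) for y, c in zip(ys, cx))
    return b0, b1, s0


# Twenty points on y = 22 + 2*(x - 10.5), with one gross outlier.
xs = list(range(1, 21))
ys = [22 + 2 * (x - 10.5) for x in xs]
ys[0] = 500.0
b0, b1, s0 = initial_estimates(xs, ys)
print(b0, round(b1, 3), round(s0, 3))
```

Despite the outlier, the starting values stay close to the true
coefficients 22 and 2, which is all the iteration needs from them.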



C. SUMMARY OF PROCEDURE



1. Overall Effect



Figure 1 is a typical scatterplot of data which
includes extreme observations, or "outliers", and sketches
of representative least-squares and robust fits. The effect
of the weighting procedure is to pull the extreme
observations in closer to the bulk of the data, reducing
their tendency to distort the fit (note least-squares line).
It should be noted that both the response variable and the
carriers are weighted in this technique.

Figure 1 - FITS WITH OUTLIERS (scatterplot showing the least-squares fit and the robust fit)

2. Solution Not Unique



There are cases in which the robust procedure may
not converge to a single global solution. Since the
solution to the weighted normal equations is actually the
solution to the three non-linear equations obtained by
setting the partial derivatives of L equal to zero, there
exists the possibility of converging to a local solution not
optimizing $b_0$ and $b_1$. Figure 2 is an example of a local
solution. The scatterplot represents data which actually
has two separate means (the data might be drive-in movie
attendance, where observations were made only on Wednesdays
and Saturdays). A least-squares fit approximately splits
the two groups of points as indicated. A robust fit may
also split the data, but could converge to one of the two
clusters if either is sparse enough to cause the weighting
process to treat its points as outliers.

Figure 2 - LOCAL SOLUTION (scatterplot of two clusters with the robust fit)



3. Algorithm



The following flow chart depicts the algorithm for
the Cauchy-weighting regression method. The criterion for
convergence (change in both coefficients of less than 0.01%
from one iteration to the next) was somewhat arbitrary, but
was set to meet practical expectations in analysis problems
and not consume excessive amounts of computer time.




Figure 3 - ALGORITHM FLOWCHART
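
Since the flow chart itself is not reproduced here, the loop it
depicts might look like the sketch below. This is an illustration
under stated assumptions, not the thesis program: the function name,
the starting values passed in, the `max_iter` cap, and the form of the
spread update (s squared equal to the mean of the weighted squared
residuals) are all assumptions. The weight 2 / (1 + (r/s)^2) and the
0.01% relative-change test follow the text.

```python
def cauchy_weighted_fit(xs, ys, b0, b1, s, tol=1e-4, max_iter=50):
    """Iterate Cauchy-weighted least squares on y = b0 + b1*(x - xbar) + e.

    b0, b1 and s are starting values. Returns (b0, b1, s, iterations)."""
    n = len(xs)
    xbar = sum(xs) / n
    cx = [x - xbar for x in xs]
    it = 0
    for it in range(1, max_iter + 1):
        if s <= 0:
            break                        # residuals vanished; nothing to weight
        res = [y - b0 - b1 * c for y, c in zip(ys, cx)]
        w = [2.0 / (1.0 + (r / s) ** 2) for r in res]

        # Weighted normal equations for the two coefficients.
        sw = sum(w)
        swx = sum(wi * c for wi, c in zip(w, cx))
        swy = sum(wi * y for wi, y in zip(w, ys))
        swxx = sum(wi * c * c for wi, c in zip(w, cx))
        swxy = sum(wi * c * y for wi, c, y in zip(w, cx, ys))
        det = sw * swxx - swx * swx
        new_b1 = (sw * swxy - swx * swy) / det
        new_b0 = (swy - new_b1 * swx) / sw

        # Assumed spread update: s^2 = (1/n) * sum(w * r^2).
        s = (sum(wi * r * r for wi, r in zip(w, res)) / n) ** 0.5

        # Converged when both coefficients change by less than tol (0.01%).
        converged = (abs(new_b0 - b0) <= tol * abs(b0)
                     and abs(new_b1 - b1) <= tol * abs(b1))
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1, s, it


# Illustrative data: y = 22 + 2*(x - 10.5) plus small fixed disturbances
# and one gross outlier at the largest x.
xs = list(range(1, 21))
noise = [0.25 * ((7 * i) % 5 - 2) for i in range(20)]
ys = [22 + 2 * (x - 10.5) + e for x, e in zip(xs, noise)]
ys[-1] += 200.0
b0, b1, s, iterations = cauchy_weighted_fit(xs, ys, b0=24.0, b1=1.8, s=1.0)
print(round(b0, 2), round(b1, 2), iterations)
```

The outlier's residual is several hundred times the spread, so its
weight is close to zero and the fit is governed by the other nineteen
points.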
D. INADEQUACY OF R2



One of the measures of adequacy of fit for least-squares
regression is $R^2$, the amount of variance explained by the
regression. It is the ratio

$$R^2 = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2}$$

where

$$\hat{y}_i = \hat{b}_0 + \hat{b}_1 (x_i - \bar{x}).$$

For least-squares, $R^2$ is a fraction between 0 and 1, but for
a robust procedure, the above ratio may exceed 1. This
occurs when the robust fit is "farther" from the mean of the
data than the least-squares fit.

Consider the following set of observations.

    y    3.75    6.00    7.00    8.00    10.25
    x    1.00    2.00    4.00    6.00     7.00

The mean of the $y_i$ is 7.00 and a least-squares fit of the
model $y = b_0 + b_1 x$ to the data yields $\hat{b}_0 = 3.385$,
$\hat{b}_1 = 0.904$ and $R^2 = 0.919$. A robust fit would reduce the
effects of observations (2.00,6.00) and (6.00,8.00) since
they lie somewhat off the line through the other three
points. A robust procedure, bringing these "extreme"
observations in closer to the rest of the data, might well
yield coefficients of $\hat{b}_0 = 3.000$ and $\hat{b}_1 = 1.000$. From
these coefficients and the data, $R^2 = 1.124$. Figure 4 is a
scatterplot of the observations with drawings of the actual
least-squares fit and the postulated robust fit. Note that
the two fits are very close, but more importantly, that the
robust fit is rotated so that $(\hat{y}_i - \bar{y})^2$ for the robust
fit is everywhere greater than or equal to the same measure for
the least-squares fit.

Figure 4 - SITUATION IN WHICH R2 EXCEEDS 1



More generally, as in Figure 1, $R^2$ as calculated above
is small due to the large deviations from the mean caused by
outliers. When a response variable has only a single
carrier, a plot of the data and the fitted line provide a
visual evaluation of the fit. In multivariable cases, it is
usually impossible to plot the data meaningfully, and the
good fit of the robust line to the bulk of the data could be
belied by inappropriately using $R^2$ as a measure of the fit.
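
The five-point example in this section can be checked numerically.
The sketch below is illustrative; the function name is an assumption.
It uses the rounded least-squares coefficients 3.385 and 0.904, which
are the values the normal equations give for these data, and the
postulated robust coefficients 3.000 and 1.000.

```python
ys = [3.75, 6.00, 7.00, 8.00, 10.25]
xs = [1.00, 2.00, 4.00, 6.00, 7.00]


def r_squared_ratio(b0, b1, xs, ys):
    """Ratio sum((yhat - ybar)^2) / sum((y - ybar)^2) for y = b0 + b1*x."""
    ybar = sum(ys) / len(ys)
    fitted = [b0 + b1 * x for x in xs]
    explained = sum((f - ybar) ** 2 for f in fitted)
    total = sum((y - ybar) ** 2 for y in ys)
    return explained / total


ls_ratio = r_squared_ratio(3.385, 0.904, xs, ys)      # least-squares fit
robust_ratio = r_squared_ratio(3.000, 1.000, xs, ys)  # postulated robust fit
print(round(ls_ratio, 3), round(robust_ratio, 3))  # prints: 0.919 1.124
```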
III. EXPERIMENTAL PROCEDURE FOR A SINGLE CARRIER



A. GENERAL DESCRIPTION



A "true" model

$$y_i = 22 + 2(x_i - \bar{x}) + e_i$$

was established to enable comparisons of the Cauchy
weighting technique and least-squares. The $x_i$ were the
integers 1 through 20, and random variates were selected
from one of five controlled error distributions to produce
20 observations of the $y_i$. The $y_i$ were then regressed on
the $(x_i - \bar{x})$ to obtain estimates for $b_0$ and $b_1$ which
could be compared to the actual values.

One thousand replications were made for each
distribution and each method. Histograms were constructed
for both $\hat{b}_0$ and $\hat{b}_1$ to reveal their distributions.
Preliminary runs indicated that most problems converged
(both coefficients changed less than 0.01% from one
iteration to the next) within 10 iterations. To reduce the
amount of time to perform the experiment, the
Cauchy-weighting iterations were terminated no later than
the seventh iteration. Values of the coefficients were
recorded at the fourth iteration to see if there were
significant changes between that and the final iteration.
If the problem converged early, data normally collected at a
later point were assigned the stabilized values.


B. ERROR DISTRIBUTIONS



Five controlled error distributions were used to disturb
the observations. The first, the Gaussian or "Normal"
distribution with mean zero, was matched to the second, a
Cauchy distribution. This was done by integrating the
Cauchy density centered at zero to find the 75th quantile,
giving 1 as a measure of the spread of the distribution.
Since the corresponding Normal(0,1) 75th quantile is 0.6745,
a Normal distribution with standard deviation 1.4826 will
have the same interquartile range as a Cauchy distribution
with spread parameter 1. The third source of error was a
uniform distribution with mean zero and variance matched to
the above Normal, giving it a range of -13.1886 to 13.1886.
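
The matching of the Normal to the Cauchy distribution can be
reproduced from the two 75th quantiles. This is a small check using
the Python standard library; the variable names are assumptions.

```python
import math
from statistics import NormalDist

# 75th quantile of a Cauchy distribution centred at zero with spread 1,
# from the inverse of its distribution function.
cauchy_q75 = math.tan(math.pi * (0.75 - 0.5))

# 75th quantile of the standard Normal distribution.
normal_q75 = NormalDist(0.0, 1.0).inv_cdf(0.75)

# Standard deviation giving the Normal the same 75th quantile, and
# hence the same interquartile range, as the Cauchy.
matched_sd = cauchy_q75 / normal_q75
print(round(normal_q75, 4), round(matched_sd, 4))  # prints: 0.6745 1.4826
```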

The "V" density is a function of Cauchy variates C:

$$V = C e^{C/100}$$

It is positively skewed, but virtually symmetric between -18
and +18 with a very pronounced central spike. Figure 5 is a
histogram of 2000 "V" variates.

The final test density is a function of Normal variates
N with mean zero and variance 1:

$$Z = N e^{N + 0.01 N^2}$$

It is positively skewed and slightly biased. A histogram of
2000 "Z" variates is shown in Figure 6.
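
Variates from the "V" and "Z" densities might be generated as in the
sketch below. It is an illustration only: the function names, the
seed, the inverse-CDF method for the Cauchy variate, and the handling
of floating-point overflow are assumptions, not details taken from
the thesis.

```python
import math
import random


def cauchy_variate(rng):
    """Standard Cauchy variate obtained by inverting the CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def v_variate(rng):
    """V = C * exp(C / 100) for a standard Cauchy C."""
    c = cauchy_variate(rng)
    try:
        return c * math.exp(c / 100.0)
    except OverflowError:
        return math.inf     # extreme positive draw, reported as +infinity


def z_variate(rng):
    """Z = N * exp(N + 0.01 * N**2) for a standard Normal N."""
    n = rng.gauss(0.0, 1.0)
    return n * math.exp(n + 0.01 * n * n)


rng = random.Random(1977)   # arbitrary seed, for repeatability only
v_sample = [v_variate(rng) for _ in range(2000)]
z_sample = [z_variate(rng) for _ in range(2000)]
print(sorted(v_sample)[1000], sum(z_sample) / len(z_sample))
```

Both samples have a median near zero, while the mean of the "Z"
sample is positive, which is the bias mentioned in the text.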
IV. RESULTS OF SINGLE-CARRIER EXPERIMENT



A. LEAST-SQUARES ADVANTAGES



Summary statistics for the distributions of $\hat{b}_0$ and $\hat{b}_1$
for the single-carrier experiment are shown in Figures 7 and
8. Looking at means and standard deviations, least-squares
estimates (maximum-likelihood estimates for normal-error
data) are better for normally-distributed cases than the
Cauchy method. Interestingly, least-squares is also
noticeably better when the error comes from a uniform
distribution. This result could be explained by the
relatively broad area in which the data points may fall for
the uniform error with respect to the range of the $(x_i - \bar{x})$
used, and the susceptibility of the Cauchy method to
convergence to local solutions.

Figure 7 - SUMMARY OF SINGLE-CARRIER $b_0$ DISTRIBUTION

Figure 8 - SUMMARY OF SINGLE-CARRIER $b_1$ DISTRIBUTION

Figure 9 is a diagram of the region in which data
points may fall when the error is uniform between
±13.1886. Since the observations may lie anywhere in the
region with equal probability, the weighting process may not
be able to clearly discriminate which points are outliers.
Chance alignments of a series of points could determine a
local optimum upon which the Cauchy-likelihood method would
converge. While other distributions have longer tails, the
bulk of their variates fall within a relatively small
distance of their center, better defining a mean trend and
clearly differentiating outliers from the rest of the
observations.

Figure 9 - EQUALLY-LIKELY REGION FOR UNIFORM DATA



B. CAUCHY METHOD ADVANTAGES



Comparing means and standard deviations for the two 
methods, the Cauchy-likelihood procedure is clearly the more 
reliable technique when the errors have long tails. Maximum 
and minimum estimates of the coefficients are closer to 
their true values when the Cauchy technique is used, and it 
never produces extreme estimates. There is little 
difference in the estimates from the fourth to the seventh 
iterations. 



C. SIMILARITIES 



The means and medians for both methods are not
significantly different. The coefficient estimates between
the 25th and 75th quantiles are virtually the same over all
distributions, the Cauchy-based method doing better for the
long-tailed distributions and the normal having some
advantage principally when the error is uniform. Even
between the 10th and 90th quantiles, the two procedures
yield very similar results.

V. REGRESSION ON TWO INDEPENDENT CARRIERS



A. MULTIVARIATE MODEL



For two independent carriers, the linear model is
assumed to be of the form

$$y_i = b_0 + b_1 (x_{i1} - \bar{x}_1) + b_2 (x_{i2} - \bar{x}_2) + e_i$$

The regression can be expressed in matrix terms as

$$Y = XB + E.$$

Y is an n×1 matrix of n response variable observations, X is
an n×p matrix having all 1's in the first column, the n
observations of the first carrier in the second column, and
the n observations of each of the remaining p-2 carriers in
each of the remaining columns. B is a p×1 matrix of
coefficients, $b_0, b_1, b_2, \ldots, b_{p-1}$, and E is an n×1 matrix
of unknown random errors, independent and identically
distributed, centered at zero and having constant spread.

Using a prime (') to designate the transpose of a
matrix, the least-squares normal equations are

$$X'Y = X'X\hat{B}$$

The weighted normal equations are

$$(WX)'Y = (WX)'X\hat{B}$$

where W is an n×n matrix having as its diagonal elements $w_{ii}$
the k-th iteration weights, $w_i^{(k)}$, and zeros elsewhere. The
weighted normal equations are equivalent to a system of p
equations in p unknowns (the $b_i$, i = 0,1,...,p-1) which can
be written as

$$U = VB$$

with $U = (WX)'Y$ and $V = (WX)'X$, and is easily solved for B.
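
The weighted normal equations can be formed and solved directly for a
small problem. The following sketch is an illustration under stated
assumptions: the function names, the elimination routine, the
weights, and the data (generated without error and with uncentred
carriers, for brevity) are not taken from the thesis.

```python
def solve(a, b):
    """Solve the square system a*x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[piv][col] == 0:
            raise ValueError("singular system")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def weighted_normal_equations(X, y, w):
    """Return B solving (WX)'Y = (WX)'X B, where W = diag(w)."""
    p = len(X[0])
    v = [[sum(wi * row[j] * row[k] for wi, row in zip(w, X))
          for k in range(p)] for j in range(p)]       # (WX)'X
    u = [sum(wi * row[j] * yi for wi, row, yi in zip(w, X, y))
         for j in range(p)]                           # (WX)'Y
    return solve(v, u)


x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 5.0, 1.0, 4.0, 3.0]
X = [[1.0, a, b] for a, b in zip(x1, x2)]
y = [13.0 + 3.0 * a - 0.5 * b for a, b in zip(x1, x2)]
w = [1.0, 0.5, 2.0, 1.0, 0.25]
B = weighted_normal_equations(X, y, w)
print([round(b, 6) for b in B])
```

Because the responses here contain no error, any set of positive
weights recovers the generating coefficients 13, 3 and -0.5.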



B. MODIFICATION TO INITIAL ESTIMATION PROCEDURE



Multiple-carrier regression problems require a
modification to the initial estimate procedure to ensure
that any interdependence among the carriers is removed prior
to estimating the effects of the carriers on the response
variable. D. F. Andrews [1] has suggested a rather
time-consuming method applying a robust sweep operator to
the columns of the X matrix in an iterative process. An
alternate method inspired by Mosteller and Tukey [6]
sequentially regresses the carriers on each of their
predecessors in the X-matrix to eliminate unwanted effects.

Multiple regression may be viewed as a sequence of
single-carrier regressions in which the dependent variable,
y, is regressed on the first carrier alone according to the
model

$$y = \gamma_1 x_1 + \text{residual}$$

Let $y_{;1}$ ("y adjusted for $x_1$") be the residual after the
effects of $x_1$ are removed:

$$y_{;1} = y - \hat{\gamma}_1 x_1$$

This residual is set aside while the effect of $x_1$ on $x_2$ is
removed using the model

$$x_2 = d_{2;1} x_1 + x_{2;1},$$

$x_{2;1}$ being "$x_2$ adjusted for $x_1$". The residual of y
($y_{;1}$) is then regressed on $x_{2;1}$ to find $\hat{b}_2$ in the model

$$y_{;1} = b_2 x_{2;1} + \text{residual}$$

Substituting for $y_{;1}$ and $x_{2;1}$,
$\hat{b}_1 = (\hat{\gamma}_1 - d_{2;1} \hat{b}_2)$ so that

$$y = \hat{b}_1 x_1 + \hat{b}_2 x_2 + \text{residual}$$

For a model having a mean effect,

$$y_i = b_0 + b_1 (x_{i1} - \bar{x}_1) + b_2 (x_{i2} - \bar{x}_2) + \text{residual},$$

$\hat{b}_0$ is in practice found immediately (from the median of
the $y_i$, as before) and its effect is not removed, for
computational reasons. It is not important to remove
the mean effect since it is independent of the carrier
effects that must be removed.
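
The sequential scheme can be checked on a small example. The sketch
below substitutes ordinary least-squares slopes through the origin for
the median-based slopes the thesis uses, purely to show that the
back-substitution $b_1 = \gamma_1 - d_{2;1} b_2$ recovers the joint
coefficients; the function names and data are assumptions.

```python
def slope_through_origin(u, v):
    """Least-squares slope of v on u with no intercept."""
    return sum(a * b for a, b in zip(u, v)) / sum(a * a for a in u)


def sequential_two_carrier(x1, x2, y):
    """Fit y = b1*x1 + b2*x2 by a sequence of single-carrier fits."""
    g1 = slope_through_origin(x1, y)                # y regressed on x1
    y_adj = [yi - g1 * a for yi, a in zip(y, x1)]   # y adjusted for x1
    d21 = slope_through_origin(x1, x2)              # x2 regressed on x1
    x2_adj = [b - d21 * a for a, b in zip(x1, x2)]  # x2 adjusted for x1
    b2 = slope_through_origin(x2_adj, y_adj)        # adjusted y on adjusted x2
    b1 = g1 - d21 * b2                              # back-substitution
    return b1, b2


# Centred carriers that are correlated with each other.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [0.0, -2.0, 1.0, -1.0, 2.0]
y = [3.0 * a - 0.5 * b for a, b in zip(x1, x2)]
b1, b2 = sequential_two_carrier(x1, x2, y)
print(round(b1, 6), round(b2, 6))  # prints: 3.0 -0.5
```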

Estimates for $\gamma_i$ are found using the median estimate
described above:

$$\hat{\gamma}_i = \frac{\operatorname{median}(y_{a,u}) - \operatorname{median}(y_{a,l})}{\operatorname{median}(x_{a,ui}) - \operatorname{median}(x_{a,li})}$$

where the subscript $a$ indicates the quantities have been
adjusted for all preceding carriers. For example, the
estimate for $\gamma_3$ would require that $y_i$ and $x_{i3}$ both be
adjusted for the effects of carrier $x_1$ and carrier $x_{2;1}$. A
similar procedure finds the $d_{j;a}$ coefficients for the j-th
carrier regressed on its predecessors. The $\gamma_i$ and $d_{j;a}$ may
then be arranged in a system of equations in the $b_i$ and
subsequently solved to yield the coefficients for the
desired model.

C. TWO-CARRIER EXPERIMENT



The two-carrier experiment was analogous to the
one-carrier tests. Coefficients $b_0$, $b_1$ and $b_2$ were fixed to
establish a known model

$$y_i = 13 + 3(x_{i1} - \bar{x}_1) - 0.5(x_{i2} - \bar{x}_2)$$

The $x_{i1}$ were the integers 1 through 20 in ascending order;
the $x_{i2}$ were the same integers shuffled to establish
independence in the X-matrix columns. The "true" $y_i$ were
then calculated and subsequently disturbed by the same
additive error $e_i$ as in the one-carrier case.


Since the single-carrier experiment showed little change 
in the values of the coefficients from the fourth iteration 
to the seventh, the two-carrier iterations were terminated 
after four cycles (or convergence) for each of 1000 
replications for each of the five distributions. Only final 
values were recorded since the initial guesses tended to be 
somewhat unstable in the first experiment. 



D. RESULTS OF THE TWO-CARRIER EXPERIMENT 



The results of the second experiment are summarized in 
Figures 20, 21 and 22 in the Appendix. The estimates of the 
coefficients parallel the single-carrier cases exactly, with 
the exception of the Cauchy method applied to uniform 
disturbances. The standard deviations of the Cauchy
estimates for the coefficients of the uniform-disturbed data 
are slightly lower than in the single-carrier case, in 
contrast to the general trend for the standard deviations to 
be higher for the two-carrier problems. While there seems 
to be some interaction between the carriers which raises the 
standard deviations in general, the use of two carriers may 
be reducing the tendency of the Cauchy method to converge to 
a local solution. 

VI. CONCLUSIONS



The robust method developed and tested in this paper 
demonstrates extremely stable behavior over a variety of 
distributions of random error. Traditional least-squares 
estimation, on the other hand, is subject to potentially 
large errors in its estimates of regression coefficients. 
The method based on Cauchy-likelihood weighting has
consistently smaller error when outliers are present and
only slightly larger errors (though never any extreme
errors) when the error distribution is closer to the Normal
distribution.

The Cauchy-likelihood estimates appear to be very
slightly biased. The centers of the $y_i$ tend to be estimated
too high, while the slopes of the carriers are consistently
low. Possibly, if the experiment were run again with the
signs of the error terms reversed, the apparent biasing
would also reverse to imply that the procedure is robustly
unbiased.

There are two drawbacks to the Cauchy method. The first 
is its requirement for more calculations and intermediate 
storage. The initial estimates of the coefficients alone 
require more computer assets than least-squares needs for a 
complete, though possibly erroneous, solution. As a general 
rule, the robust Cauchy-likelihood procedure requires twice 
the data storage capacity and five to six times as long to 
run as a basic least-squares routine. Clearly, the large 
reduction in risk for obtaining seriously inaccurate 
estimates of regression coefficients warrants the use of the 
Cauchy, or some other robust procedure, in every-day data
analysis, even with the increase in computer requirements.

The other problem with using the Cauchy-likelihood
method is the possible convergence upon a local solution. 
It should be noted, however, that traditional least-squares 
will also produce erroneous results when used under the same 
conditions which would cause the Cauchy-based technique to 
stabilize at a local solution. 

APPENDIX A



HISTOGRAMS OF LEAST-SQUARES ESTIMATES

Figure 10 - LEAST-SQUARES ESTIMATE OF $b_0$ WITH NORMAL ERROR
Figure 11 - LEAST-SQUARES ESTIMATE OF $b_1$ WITH NORMAL ERROR
Figure 14 - LEAST-SQUARES ESTIMATE OF $b_0$ WITH UNIFORM ERROR
Figure 15 - LEAST-SQUARES ESTIMATE OF $b_1$ WITH UNIFORM ERROR
Figure 16 - LEAST-SQUARES ESTIMATE OF $b_0$ WITH "V" ERROR
Figure 17 - LEAST-SQUARES ESTIMATE OF $b_1$ WITH "V" ERROR
Figure 18 - LEAST-SQUARES ESTIMATE OF $b_0$ WITH "Z" ERROR
Figure 19 - LEAST-SQUARES ESTIMATE OF $b_1$ WITH "Z" ERROR

True value = 13

                     Normal    Cauchy    Uniform   "V"       "Z"
Mean
  Cauchy             13.02     13.01     13.03     13.01     13.18
  Least-squares      13.00     298       12.99     1.4E9     14.8
Std. Dev.
  Cauchy             .417      .406      2.71      .405      .358
  Least-squares      .331      9240      1.69      1.8E10    2.78
Minimum
  Cauchy             11.66     11.52     4.68      11.49     12.69
  Least-squares      12.02     -4063     8.24      3.83      12.86
.10 Quantile
  Cauchy             12.51     12.54     9.54      12.55     12.83
  Least-squares      12.59     9.85      10.87     11.57     13.47
.25 Quantile
  Cauchy             12.74     12.76     11.21     12.75     12.93
  Least-squares      12.77     11.88     11.78     12.34     13.86
.50 Quantile
  Cauchy             13.02     13.00     13.02     12.99     13.08
  Least-squares      13.00     13.02     12.97     13.25     14.40
.75 Quantile
  Cauchy             13.31     13.27     14.87     13.26     13.32
  Least-squares      13.23     14.04     14.18     14.58     15.16
.90 Quantile
  Cauchy             13.56     13.52     16.58     13.51     13.64
  Least-squares      13.42     16.15     15.21     18.54     16.18
Maximum
  Cauchy             14.35     15.01     20.18     14.63     15.35
  Least-squares      14.04     29.22     18.30     2.3E11    89.14

Figure 20 - SUMMARY OF TWO-CARRIER $b_0$ DISTRIBUTION

True value = 3

Figure 21 - SUMMARY OF TWO-CARRIER $b_1$ DISTRIBUTION

Figure 22 - SUMMARY OF TWO-CARRIER $b_2$ DISTRIBUTION